Multilinguals and Wikipedia Editing
This article analyzes one month of edits to Wikipedia in order to examine the
role of users editing multiple language editions (referred to as multilingual
users). Such multilingual users may serve an important function in diffusing
information across different language editions of the encyclopedia, and prior
work has suggested this could reduce the level of self-focus bias in each
edition. This study finds multilingual users are much more active than their
single-edition (monolingual) counterparts. They are found in all language
editions, but smaller-sized editions with fewer users have a higher percentage
of multilingual users than larger-sized editions. About a quarter of
multilingual users always edit the same articles in multiple languages, while
just over 40% of multilingual users edit different articles in different
languages. When non-English users do edit a second language edition, that
edition is most frequently English. Nonetheless, several regional and
linguistic cross-editing patterns are also present.
Modeling the Rise in Internet-based Petitions
Contemporary collective action, much of which involves social media and other
Internet-based platforms, leaves a digital imprint which may be harvested to
better understand the dynamics of mobilization. Petition signing is an example
of collective action which has gained in popularity with rising use of social
media and provides such data for the whole population of petition signatories
for a given platform. This paper tracks the growth curves of all 20,000
petitions to the UK government over 18 months, analyzing the rate of growth and
outreach mechanism. Previous research has suggested the importance of the first
day to the ultimate success of a petition, but has not examined early growth
within that day, made possible here through hourly resolution in the data. The
analysis shows that the vast majority of petitions do not achieve any measure
of success; over 99 percent fail to get the 10,000 signatures required for an
official response and only 0.1 percent attain the 100,000 required for a
parliamentary debate. We analyze the data through a multiplicative process
model framework to explain the heterogeneous growth of signatures at the
population level. We define and measure an average outreach factor for
petitions and show that it decays very fast (reducing to 0.1% after 10 hours).
After 24 hours, a petition's fate is virtually set. The findings seem to
challenge conventional analyses of collective action from economics and
political science, where the production function has been assumed to follow an
S-shaped curve. Comment: Submitted to EPJ Data Science
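The multiplicative growth idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's model: the function name `simulate_petition`, the log-normal shock, and the per-hour decay of 0.5 are all assumptions chosen for illustration (0.5 ** 10 is roughly 0.001, which loosely mirrors an outreach factor falling to about 0.1% after 10 hours).

```python
import random


def simulate_petition(hours=48, initial=1.0, outreach0=1.0, decay=0.5, seed=0):
    """Simulate hourly signature counts under a multiplicative process.

    All parameters are hypothetical. Each hour the count is multiplied by
    (1 + outreach * shock), where outreach decays geometrically and shock
    is a positive random factor.
    """
    rng = random.Random(seed)
    count = initial
    history = [count]
    for hour in range(hours):
        outreach = outreach0 * decay ** hour    # assumed geometric decay
        shock = rng.lognormvariate(0.0, 1.0)    # assumed heterogeneity term
        count *= 1.0 + outreach * shock
        history.append(count)
    return history


if __name__ == "__main__":
    trajectory = simulate_petition()
    # Growth after the first day relative to the total at hour 24.
    print(trajectory[24], trajectory[-1] / trajectory[24])
```

Under these assumptions almost all growth happens in the first few hours, and the ratio printed for the second day stays very close to 1, which is the qualitative pattern the abstract describes.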
Deciphering implicit hate: evaluating automated detection algorithms for multimodal hate
Accurate detection and classification of online hate is a difficult task. Implicit hate is particularly challenging as such content tends to have unusual syntax, polysemic words, and fewer markers of prejudice (e.g., slurs). This problem is heightened with multimodal content, such as memes (combinations of text and images), as they are often harder to decipher than unimodal content (e.g., text alone). This paper evaluates the role of semantic and multimodal context for detecting implicit and explicit hate. We show that both text and visual enrichment improve model performance, with the multimodal model's F1 score (0.771) exceeding those of the other models (0.544, 0.737, and 0.754). While the unimodal-text context-aware (transformer) model was the most accurate on the subtask of implicit hate detection, the multimodal model outperformed it overall because of a lower propensity towards false positives. We find that all models perform better on content with full annotator agreement and that multimodal models are best at classifying the content where annotators disagree. To conduct these investigations, we undertook high-quality annotation of a sample of 5,000 multimodal entries. Tweets were annotated for primary category, modality, and strategy. We make this corpus, along with the codebook, code, and final model, freely available.
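As a reminder of how the F1 scores quoted above are defined, and why fewer false positives raise the score, here is a minimal sketch. The counts are made up for illustration and are not taken from the paper; `f1_score` is a hypothetical helper, not the authors' evaluation code.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Return F1, the harmonic mean of precision and recall.

    Equivalent to 2*TP / (2*TP + FP + FN); returns 0.0 when undefined.
    """
    denominator = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denominator if denominator else 0.0


if __name__ == "__main__":
    # Two hypothetical models with equal recall but different false positives.
    print(f1_score(true_pos=80, false_pos=40, false_neg=20))
    print(f1_score(true_pos=80, false_pos=10, false_neg=20))
```

With the same true positives and false negatives, the model that raises fewer false alarms gets the higher F1.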
Does Campaigning on Social Media Make a Difference? Evidence from candidate use of Twitter during the 2015 and 2017 UK Elections
Social media are now a routine part of political campaigns all over the
world. However, studies of the impact of campaigning on social platforms have
thus far been limited to cross-sectional datasets from one election period,
which are vulnerable to unobserved variable bias. Hence empirical evidence on
the effectiveness of political social media activity is thin. We address this
deficit by analysing a novel panel dataset of political Twitter activity in the
2015 and 2017 elections in the United Kingdom. We find that Twitter-based
campaigning does seem to help win votes, a finding which is consistent across a
variety of different model specifications including a first difference
regression. The impact of Twitter use is small in absolute terms, though
comparable with that of campaign spending. Our data also support the idea that
effects are mediated through other communication channels, hence challenging
the relevance of engaging in an interactive fashion.
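The first difference regression mentioned above removes any fixed, candidate-specific effect by regressing the change in the outcome on the change in the predictor. A minimal sketch with one predictor follows. The variable names and the toy numbers are hypothetical and do not come from the study's data.

```python
def first_difference_slope(x_before, x_after, y_before, y_after):
    """Estimate the OLS slope of (y_after - y_before) on (x_after - x_before).

    Time-invariant unit effects cancel out when differencing, which is why
    panel data helps against unobserved variable bias.
    """
    dx = [after - before for before, after in zip(x_before, x_after)]
    dy = [after - before for before, after in zip(y_before, y_after)]
    mean_dx = sum(dx) / len(dx)
    mean_dy = sum(dy) / len(dy)
    covariance = sum((a - mean_dx) * (b - mean_dy) for a, b in zip(dx, dy))
    variance = sum((a - mean_dx) ** 2 for a in dx)
    return covariance / variance


if __name__ == "__main__":
    # Hypothetical candidates: vote share = fixed effect + 2 * activity.
    fixed = [10.0, 20.0, 30.0]
    activity_2015 = [1.0, 2.0, 3.0]
    activity_2017 = [2.0, 4.0, 3.0]
    votes_2015 = [f + 2 * a for f, a in zip(fixed, activity_2015)]
    votes_2017 = [f + 2 * a for f, a in zip(fixed, activity_2017)]
    print(first_difference_slope(activity_2015, activity_2017,
                                 votes_2015, votes_2017))
```

In this toy setup the very different fixed effects drop out of the differenced data, so the slope recovers the coefficient of 2 used to generate the outcomes.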
Mapping the UK Webspace: Fifteen Years of British Universities on the Web
This paper maps the national UK web presence on the basis of an analysis of
the .uk domain from 1996 to 2010. It reviews previous attempts to use web
archives to understand national web domains and describes the dataset. Next, it
presents an analysis of the .uk domain, including the overall number of links
in the archive and changes in the link density of different second-level
domains over time. We then explore changes over time within a particular
second-level domain, the academic subdomain .ac.uk, and compare linking
practices with variables, including institutional affiliation, league table
ranking, and geographic location. We do not detect institutional affiliation
affecting linking practices and find only partial evidence of league table
ranking affecting network centrality, but find a clear inverse relationship
between the density of links and the geographical distance between
universities. This echoes prior findings regarding offline academic activity,
which allows us to argue that real-world factors like geography continue to
shape academic relationships even in the Internet age. We conclude with
directions for future uses of web archive resources in this emerging area of
research. Comment: To appear in the proceedings of WebSci 201
Petition Growth and Success Rates on the UK No. 10 Downing Street Website
Now that so much of collective action takes place online, web-generated data
can further understanding of the mechanics of Internet-based mobilisation. This
trace data offers social science researchers the potential for new forms of
analysis, using real-time transactional data based on entire populations,
rather than sample-based surveys of what people think they did or might do.
This paper uses a 'big data' approach to track the growth of over 8,000
petitions to the UK Government on the No. 10 Downing Street website for two
years, analysing the rate of growth per day and testing the hypothesis that the
distribution of daily change will be leptokurtic (rather than normal) as
previous research on agenda setting would suggest. This hypothesis is
confirmed, suggesting that Internet-based mobilisation is characterized by
tipping points (or punctuated equilibria) and explaining some of the volatility
in online collective action. We also find that most successful petitions grow
quickly and that the number of signatures a petition receives on its first day
is a significant factor in explaining the overall number of signatures a
petition receives during its lifetime. These findings have implications for the
strategies of those initiating petitions and the design of web sites with the
aim of maximising citizen engagement with policy issues. Comment: To appear in the proceedings of WebSci'13, May 1-5, 2013, Paris, France
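The leptokurtosis hypothesis above can be checked by computing the excess kurtosis of daily changes, which is 0 for a normal distribution and positive for a leptokurtic one. The sketch below uses made-up numbers, not the petition data, and `excess_kurtosis` is a hypothetical helper based on population moments.

```python
def excess_kurtosis(values):
    """Return the excess kurtosis m4 / m2**2 - 3 from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


if __name__ == "__main__":
    # Hypothetical daily changes: mostly no movement, with two rare bursts.
    daily_change = [0.0] * 98 + [10.0, -10.0]
    print(excess_kurtosis(daily_change))
```

A series that is mostly quiet with a few large jumps yields a strongly positive value, the signature of the tipping-point behaviour the abstract describes.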
Lost in Translation -- Multilingual Misinformation and its Evolution
Misinformation and disinformation are growing threats in the digital age,
spreading rapidly across languages and borders. This paper investigates the
prevalence and dynamics of multilingual misinformation through an analysis of
over 250,000 unique fact-checks spanning 95 languages. First, we find that
while the majority of misinformation claims are only fact-checked once, 11.7%,
corresponding to more than 21,000 claims, are checked multiple times. Using
fact-checks as a proxy for the spread of misinformation, we find 33% of
repeated claims cross linguistic boundaries, suggesting that some
misinformation permeates language barriers. However, spreading patterns exhibit
strong homophily, with misinformation more likely to spread within the same
language. To study the evolution of claims over time and mutations across
languages, we represent fact-checks with multilingual sentence embeddings and
cluster semantically similar claims. We analyze the connected components and
shortest paths connecting different versions of a claim, finding that claims
gradually drift over time and undergo greater alteration when traversing
languages. Overall, this novel investigation of multilingual misinformation
provides key insights. It quantifies redundant fact-checking efforts,
establishes that some claims diffuse across languages, measures linguistic
homophily, and models the temporal and cross-lingual evolution of claims. The
findings advocate for expanded information sharing between fact-checkers
globally while underscoring the importance of localized verification.
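The clustering step described above (embed each fact-check, link similar ones, read off connected components) can be sketched as follows. This is an illustration under stated assumptions: the tiny two-dimensional vectors stand in for real multilingual sentence embeddings, the 0.8 cosine threshold is arbitrary, and `cluster_claims` is a hypothetical name rather than the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_claims(vectors, threshold=0.8):
    """Group vector indices into connected components of a similarity graph.

    Two items are linked when their cosine similarity reaches the threshold;
    components are found with a simple union-find structure.
    """
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Placeholder embeddings: items 0 and 1 are near-duplicates, 2 is distinct.
    embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]
    print(cluster_claims(embeddings))
```

Because components are transitive, a claim can end up in the same cluster as a distant variant through a chain of intermediate versions, which is what makes path-based analysis of gradual drift possible.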